An Artificial Language Evaluation of Distributional Semantic Models
نویسندگان
چکیده
Recent studies of distributional semantic models have set up a competition between word embeddings obtained from predictive neural networks and word vectors obtained from count-based models. This paper is an attempt to reveal the underlying contribution of additional training data and post-processing steps on each type of model in word similarity and relatedness inference tasks. We do so by designing an artificial language, training a predictive and a count-based model on data sampled from this grammar, and evaluating the resulting word vectors in paradigmatic and syntagmatic tasks defined with respect to the grammar.
منابع مشابه
Question Answering Based on Distributional Semantics
An NLP application for question answering provides an insight into computer’s understanding of human language. Many areas of NLP have recently built on deep learning and distributional semantic representation. This paper seeks to apply distributional semantic models and convolutional neural networks to the question answering task.
متن کاملFinnish resources for evaluating language model semantics
Distributional language models have consistently been demonstrated to capture semantic properties of words. However, research into the methods for evaluating the accuracy of the modeled semantics has been limited, particularly for less-resourced languages. This research presents three resources for evaluating the semantic quality of Finnish language distributional models: (1) semantic similarit...
متن کاملPolish evaluation dataset for compositional distributional semantics models
The paper presents a procedure of building an evaluation dataset1. for the validation of compositional distributional semantics models estimated for languages other than English. The procedure generally builds on steps designed to assemble the SICK corpus, which contains pairs of English sentences annotated for semantic relatedness and entailment, because we aim at building a comparable dataset...
متن کاملModeling Semantic Compositionality of Croatian Multiword Expressions
A distinguishing feature of many multiword expressions (MWEs) is their semantic non-compositionality. Determining the semantic compositionality of MWEs is important for many natural language processing tasks. We address the task of modeling semantic compositionality of Croatian MWEs. We adopt a composition-based approach within the distributional semantics framework. We build and evaluate model...
متن کاملRunning head: SEMANTIC COHERENCE AND DISTRIBUTIONAL LEARNING 1 Semantic coherence facilitates distributional learning
Computational models have shown that purely statistical knowledge about words’ linguistic contexts is sufficient to learn many properties of words, including syntactic and semantic category. For example, models can infer that “postman” and “mailman” are semantically similar because they have quantitatively similar patterns of association with other words (e.g., they both tend to occur with word...
متن کامل